|
In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately. == Overview == The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact match. This number is called the edit distance between the string and the pattern. The usual primitive operations are: * insertion: ''cot'' → ''coat'' * deletion: ''coat'' → ''cot'' * substitution: ''coat'' → ''cost'' These three operations may be generalized as forms of substitution by adding a NULL character (here symbolized by *) wherever a character has been deleted or inserted: * insertion: ''co *t'' → ''coat'' * deletion: ''coat'' → ''co *t'' * substitution: ''coat'' → ''cost'' Some approximate matchers also treat ''transposition'', in which the positions of two letters in the string are swapped, to be a primitive operation. Changing ''cost'' to ''cots'' is an example of a transposition. Different approximate matchers impose different constraints. Some matchers use a single global unweighted cost, that is, the total number of primitive operations necessary to convert the match to the pattern. For example, if the pattern is ''coil'', ''foil'' differs by one substitution, ''coils'' by one insertion, ''oil'' by one deletion, and ''foal'' by two substitutions. If all operations count as a single unit of cost and the limit is set to one, ''foil'', ''coils'', and ''oil'' will count as matches while ''foal'' will not. Other matchers specify the number of operations of each type separately, while still others set a total cost but allow different weights to be assigned to different operations. Some matchers permit separate assignments of limits and weights to individual groups in the pattern. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「approximate string matching」の詳細全文を読む スポンサード リンク
|